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Abstract 

Data warehouses store large volumes of data 
which are used frequently by decision sup- 
port applications. Such applications involve 
complex queries. Query performance in such 
an environment is critical because decision 
support applications often require interactive 
query response time. Because data ware- 
houses are updated infrequently, it becomes 
possible to improve query performance by 
caching sets retrieved by queries in addi- 
tion to query execution plans. In this pa- 
per we report on the design of an intelligent 
cache manager for sets retrieved by queries 
called WATCHMAN, which is particularly 
well suited for data warehousing environment. 

Our cache manager employs two novel, com- 
plementary algorithms for cache replacement 
and for cache admission. WATCHMAN aims 
at minimizing query response time and its 
cache replacement policy swaps out entire re- 
trieved sets of queries instead of individual 
pages. The cache replacement and admission 
algorithms make use of a profit metric, which 
considers for each retrieved set its average rate 
of reference, its size, and execution cost of 
the associated query. We report on a perfor- 
mance evaluation based on the TPC-D and 
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Set Query benchmarks. These experiments 
show that WATCHMAN achieves a substan- 
tial performance improvement in a decision 
support environment when compared to a tra- 
ditional LRU replacement algorithm. 

1 Introduction 

A data warehouse is a stand-alone repository of in- 
formation integrated from several, possibly heteroge- 
neous, operational databases [IK93, Wid95]. Data 
warehouses are usually dedicated to the processing 
of data analysis and decision support system (DSS) 
queries. Unlike online transaction processing (OLTP) 
queries which access only a few tuples in each relation, 
DSS queries are much more complex and access a sub- 
stantial part of the data stored in the warehouse. Con- 
sequently, the response time of DSS queries is several 
orders of magnitude higher than the response time of 
OLTP queries. In order to support interactive query 
processing, most commercial data warehouses incor- 
porate parallel processing techniques as well as effi- 
cient indexing techniques, such as bit maps, which are 
geared towards keeping the response time at an ac- 
ceptable level. 

Compared to OLTP systems, data warehouses are 
relatively static with only infrequent updates [IK93, 
Fre95]. Consequently, the query engine may bene- 
fit from caching at multiple levels: execution plans, 
access paths and the actual retrieved sets of queries 
[RCK+95]. Caching of the sets retrieved by queries 
is particularly attractive in the warehousing environ- 
ment because DSS queries retrieve relatively small sets 
of statistical data such as averages, sums, counts, etc. 
DDS queries often follow a hierarchical "drill-down 
analysis” pattern [IK93], where a query on each level 
is a refinement of some query on the previous level. 
Therefore, caching retrieved sets of queries at higher 
levels is especially effective because such queries are 
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likely to occur frequently in a multiuser environment. 

Cache replacement algorithms play a central role 
in the design of any cache manager; these algo 
rithms have been extensively studied in the context 
of operating system virtual ^^ory m^agement an 
database buffer management [CD73, LWF77, Sto , 
EH84 CD85.00W93]. Cache replacement algorithms 
usually maximize the cache hit ratio, by attempting to 
cache the most frequently referenced pages. However, 
the real goal of caching is to improve some performance 
metric based on response time or throughput. A page 
replacement algorithm based on hit ratio optimization 
can be used for response time minimization in a re- 
trieved set cache only if all retrieved sets of queries 
arP of an eoual size and all queries incur the same cost 


of execution. 

In this paper we report on the design of an intelli- 
gent cache manager of retrieved sets of caUed 

WATCHMAN (WArehouse inTelligent CacHe MAIN- 
ager). WATCHMAN employs a novel cache replace- 
ment algorithm which makes use of a “profit metric 
which considers for each retrieved set its average rate 
of reference, its size and execution cost of the asso- 
ciated query. WATCHMAN uses also a complemen- 
tary cache admission algorithm, to determine whether 
a retrieved set currently not materialized in the cache 
should be admitted to the cache. We observe here that 
cache admission algorithms are absent from database 
buffer managers because most operating systems are 
unable to manipulate data directly on disk and thus 
every referenced page must be brought into the cache. 
However, a cache admission algorithm is important in 
our environment, especially in the presence of multi- 
ple query classes. For example, caching a retrieved 
set which is computed by performing a projection, 
and is relatively inexpensive to execute, may cause 
the eviction of several hundreds of sums and averages 
whose evaluation may have involved computing expen- 
sive multi-way joins. The cache admission algorithm 
employed in WATCHMAN uses a similar profit met- 
ric as in the cache replacement algorithm, with some 
modifications to deal with the absence of any refer- 
ence frequency information for newly retrieved sets. 
Furthermore, WATCHMAN interacts with the buffer 
manager by using hints supplied by the former to pro- 
vide feedback that can be used to improve the hit ratio 
of the latter. 

The remainder of the paper is organized as follows^ 
In Section 2 we discuss the main novel features of 
WATCHMAN, namely the cache replacement and ad- 
mission algorithms, and the interaction with the buffer 
manager. Section 3 discusses the current implemen- 
tation status. Section 4 reports on experiments per- 
formed on workloads based on the TPC-D [Tra95] and 
Set Query [0’N93] benchmarks. We compare our algo- 


rithms with a vanilla LRU strategy. Section 5 discuss, 
related work and we conclude the paper in Section 6 


2 WATCHMAN Design 

The design of WATCHMAN incorporates two con 
plementary algorithms: one for cache replacemen 

denoted as LNC-R (Least Normalized Cost Replac 
ment), and the second one for cache admission, d 
noted as LNC-A (Least Normalized Cost Admission 
The cache replacement algorithm LNC-R can be us< 
stand-alone or integrated with the cache admission a 
gorithm LNC-A. We shall denote the integrated alg< 
rithm as LNC-RA. Both algorithms aim at optimizn 
the query response time by minimizing the executn 
costs of queries that miss the cache. We proceed ni 
to discuss in more detail the two algorithms and th. 
we prove the optimality of LNC-RA within a simphfi. 

mnrtpl 


2.1 Cache Replacement Algorithm 

As discussed above, LNC-R and LNC-A aim at mi 
imizing the execution time of queries that miss t. 
cache instead of minimizing the hit ratio, as is ti 
case in buffer management. In buffer managemei 
the usual criterion for deciding which objects to caci 
is based upon their probability of reference in the 
ture Since future reference patterns are not availah 
in advance, the probability of a future reference is a 
proximated from a past reference pattern under t 
assumption that these reference patterns are stable, 
order to capture the actual execution costs (or savin;: 
of a retrieved set, LNC-R makes use of two additioi 
parameters in addition to the reference pattern. Thi 
LNC-R uses the following statistics for each retne\ 
set RSi corresponding to query Q t - 

• A,: average rate of reference to query Q,- 

• Si : size of the set retrieved by query Q t . 


• c t : cost of execution of query Q,. 

LNC-R aims at minimizing the cost savings 
(CSR) defined as 


CSR = 


E. c - r - 


ro 


where /i* is the number of times that references 
query Q, were satisfied from cache, and r, is the tc 

number of references to query Q,- 

To achieve this goal, the above statistics are cc 
bined together into one performance metric, cai 
profit , defined as 

A* ■ Cj 

profit (RS t ) — 
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Let us assume that retrieved set RSj with size Sj needs 
to be admitted to the cache and the amount of free 
space in the cache is less than sj. Then LNC-R sorts 
all retrieved sets held in the cache in ascending order 
of profit and selects the candidates for eviction in the 
sort order. The justification for this heuristic is as 
follows. For a given retrieved set RS Xy the term A, • c t 
determines 1 the query execution cost savings due to 
caching RSi. However, given two retrieved sets which 
provide the same cost savings, the larger retrieved set 
should be evicted first from the cache because it frees 
more space for storage of the newly retrieved set RS r 
As pointed out in [OOW93], the LRU cache replace- 
ment algorithm performs inadequately in the presence 
of multiple workload classes, due to the limited ref- 
erence time information it uses in the selection of the 
replacement victim. Consequently, [OOW93] proposed 
the LRU-K algorithm, which considers the times of the 
last K > 1 references to each page. To deal with the 
possibility of workload variations, WATCHMAN uses 
similar ideas to [CABK88, OOW93, SWZ94, VBW95] 
in order to estimate A* based on a moving average of 
the last K inter-arrival times of requests to RS X . In 
particular, we define as 


where t is the current time and tx is the time of the 
ff-th reference. Including the current time t in (3) 
guarantees the aging of retrieved sets which are not 
referenced. To reduce overhead, the estimate of A, is 
updated only when the retrieved set is referenced or 
at fixed time periods in absence of the former. 

Whenever less than K reference times are available 
for some retrieved set RS X , the average rate of ref- 
erence A* is determined using the maximal number of 
available references. However, since retrieved sets with 
fewer references have less reliable estimates of A t , the 
cache replacement algorithm gives them a higher pri- 
ority for replacement. In particular, the LNC-R algo- 
rithm first considers all retrieved sets having just one 
reference in their profit order, then all retrieved sets 
with two references, etc., as discussed in Figure 1. 

The size Si of retrieved set RS X is available at the 
time of its retrieval. The cost, c t , of retrieving RS X may 
be either provided directly by a query optimizer (in 
this case WATCHMAN is integrated with the DBMS) 
or can be calculated from the performance statistics 
exported by most commercial DBMSs (e.g. the num- 
ber of logical or physical block reads might be a good 
estimate of the cost if the query execution costs are 
dominated by disk I/O). 

1 After normalizing by A = A,. 


2.2 Cache Admission Algorithm 

The main goal of a cache admission algorithm is to 
prevent caching of retrieved sets which may cause re- 
sponse time degradation. For example, caching of a 
set retrieved by a multi-attribute projection of a large 
relation might evict the contents of the entire cache. 
This would cause a relatively costly re-execution of 
complex statistical queries, which originally occupied 
only minimal space in the cache. 

Ideally, WATCHMAN should cache a retrieved set 
only if it improves the overall profit. Given a set 
C of replacement candidates for a retrieved set RS X> 
WATCHMAN decides to cache RS X only if RS X has a 
higher profit them all the retrieved sets in C. Namely, 
RS X is cached only if the following inequality holds 

profit(RS,) > profit(C) (4) 

where the profit of list C is defined as 

profit(C) = (5) 

l^RS,eC S J 

Although the admission criterion defined by (4) is 
intuitive, its straightforward implementation may not 
be feasible. Namely, it is not clear how to calculate 
the average reference rate A, (and thus profit) for a 
newly retrieved set RSi . As we shall discuss in Sec- 
tion 2.4, WATCHMAN retains in many cases the ref- 
erence times of retrieved sets that are evicted from the 
cache. Thus, if RS t was previously cached, WATCH- 
MAN may calculate A* from the retained reference in- 
formation if the latter is available. If less than K ref- 
erence times are available, then A, is calculated using 
the maximal number of available samples. However, 
if RS X is retrieved for the first time, there is no infor- 
mation about past reference to RSj even if WATCH- 
MAN stored reference information of all prior submit- 
ted queries. In this case, WATCHMAN makes its de- 
cision based on the only information available about 
the newly retrieved set RS t : its size s, and the cost c x 
of execution of query Q x . We define for a retrieved set 
RS X an estimated profit as 

e-profit(RSJ = — (6) 

* 5 * 

WATCHMAN caches RS X only if the following inequal- 
ity is satisfied 

e-profit(f?5,) > e-profit(C) (7) 

where the estimated profit of a list C is defined as 

e-profit(C) = (8) 

Zrs,€C s j 
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Although the decisions based on (7) are purely 
heuristic, the experimental results in Section 4 show 
that they always improve WATCHMAN’S perfor- 
mance. In the sequel, we will refer to the LNC-R 
cache replacement algorithm coupled with the above 
admission algorithm LNC-A as LNC-RA. The com- 
plete pseudo-code of LNC-RA can be found in Figure 
1 . 


2.3 Optimality of LNC-RA Under a Con- 
strained Model 


We first state our assumptions about the model. Let 
{RSi , RS 2 , . . • , RS n } be the set of retrieved sets of all 
queries. We assume that the retrieved set reference 

string is a sequence of independent 

random variables with a common, stationary distribu- 
tion {Pl,P 2 . • • • iPn} suc h P r °b{ r i ~ RSk) — Pk 


for all t > 1. 

In order to minimize query response time, the cost 
incurred by execution of queries missing the cache 
should be minimized. Therefore, the optimal cache 
replacement algorithm should cache retrieved sets 
{RSi , i 6 /*}, /• C N= {1,2, such that 


min ^ Pi • c t 


(9) 


is satisfied, subject to the constraint 

£ s, < S (10) 

ie/* 


where S is the cache size 2 . 

The problem defined by (9) and (10) is equivalent to 
the knapsack problem, which is NP-complete [GJ79). 
Consequently, there is no efficient algorithm for solv- 
ing the problem. However, if we assume that sizes of 
cached retrieved sets are relatively small when com- 
pared with the total cache size S, and thus it is always 
possible to utilize almost all space in the cache.^then 
we can restrict the solution space only to sets /* sat- 



We show that under the assumption (11), the op- 
timal solution may be found by a simple greedy algo- 
rithm, which we term LNC*. LNC* constructs / in 
the following way: First, it sorts {RSi, RS 2 , ■ ■ - , RS„) 
in a descending order of p, ■ c,/s,. Then it assigns to 
/* items from the start of the list until the space re- 
quirement (10) is violated. We show that the solution 
/* found by LNC' is optimal. 

2 The optimal cache replacement algorithm may select the 
retrieved sets for caching statically because the probability of 
reference of each query is a priori known and stationary. 


Algorithm: LNC-RA 
Input: retrieved set RSi 

s i - size of RSi 

Cx - cost of execution of query (?, 
corresponding to RS, 
avail available free space in cache 
Variables: ri, - reference information holding last 

K reference times to RS X 
\ t - estimate of average inter-arrival ra 
of references to RS% calculated froi 
case (allocation state of RS.) 

RSi in cache: 

update ri, 

RSt not in cache and avail > s t : 
cache RSi 
update ri, 

K5, not in cache and avail < s,: 

LNC-A(R5.) 

Algorithm: LNC-A 

Input: retrieved set RS % 

C = LNC-R(s,) 
if (ri, in cache) then 
update ri, 

if (profit (RS t ) > profit (C)) then 
evict all retrieved sets in C 

If retain reference information 

cache RS X 

a 


else 


allocate ri, 
update rii 

if (e- profit (RSi) > e-profit(C)) then 
evict all retrieved sets in C 

II retain reference information 
cache R5, 

ft 


fi 


Algorithm: LNC-R 
Input^ space to be freed 

Output: C - list of candidate retrieved sets 

for replacement 
for t = 1 to K do 

R, = list of retrieved sets with exactly i refer? 
in ri arranged in increasing profit order 


od 

R = 


C = 

return C 


list of all retrieved sets arranged 
in order Ri < Ri < • • < Rk 
minimal prefix of R such that - 3 


Figure 1: Pseudo-code of LNC-RA. 
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Theorem 1 The LNC algorithm finds the optimal 
solution of the problem defined by (9) and (11) 

Proof: Constraint (9) is equivalent to 


max ^ Pi • Ci 
»€/* 


(12) 


Let / ^ /* be an arbitrary subset of N satisfying (11). 
We will show that £] i6/ Pi ' ^ — 5Zie/* Pi ' c ’’ ^> nce 
LNC* selects retrieved sets with maximal pi ■ c,/sj, it 
follows that 


E 


Pi - a 
s% 


sE 


Pi ' 
Si 


(13) 


We can assume that /* fl / = 0. If not, then the 
intersecting elements can be eliminated from both sets 
while preserving (13). We define 


Pmin ' Cmm _ m<n . €/ . £•_£• 


Smin 

Pmax ' Cmaz 


— maXizj 


Si 
Pi *Ci 


(14) 

(15) 


■Smox 

Since /* contains retrieved set references with maximal 
Pi-Ci/si and mi = 0, it must be true that > 

Consequently, 

fm«i 


E . Pmin * Cfnin r* s. 

pici --7— s - 


Pmaz ' Cm ax 




s > y^p.c, 


< 6 / 


(16) 

Therefore, we have shown that the set /* constructed 
by LNC* indeed maximizes (12). □ 


We argue that the LNC-RA algorithm used by 
WATCHMAN approximates LNC*. First, we point 
out that pi = A,/A where A = 5 Z<g/v^' - Since 
the probability distribution {pi,P2. • • • ,Pn] is in gen- 
eral unkn own, the LNC-RA algorithm approximates 
it by using a sample of last K references to each 
retrieved set RSi. Thus the reference rate statis- 
tics maintained by LNC-RA approximates the distri- 
bution {pi,P 2 i • ■ • iPn} 33 K grows to infinity. Un- 
like LNC*, the LNC-RA algorithm constructs the set 
/ of cached retrieved sets on-line. If the distribu- 
tion {pi,P2,--.,Pn} is stationary, then the set / con- 
structed by LNC-RA converges to /* for sufficiently 
long reference strings (and K — > oo). 

The optimality result in this section is an asymp- 
totic one. We further study the transient behavior of 
LNC-RA in Section 4 by comparing its performance 
with vanilla LRU on the TPC-D and Set Query bench- 
mark workloads. 


2.4 Retained Reference Information Problem 

In the design of the LRU-K page replacement algo- 
rithm, [OOW93] point out a form of starvation, which 


they term a “retained reference information problem” 
We recast the problem in our setting: Assume that 
K > 1. Whenever a new retrieved set RSi is cached, it 
is among the first candidates for replacement as it has 
only incomplete reference information (i.e. it has fewer 
than K reference times). If the reference information is 
evicted from the cache together with the retrieved set 
RSi , then upon re-referencing RS,, the reference in- 
formation must be collected again from scratch. Con- 
sequently, RSi is likely to be again evicted. There- 
fore, the cache replacement algorithm cannot collect 
sufficient reference information about RSi to cache it 
permanently, irrespective of its reference rate. 

[OOW93] propose to retain the reference informa- 
tion of retrieved set RSi even after RSi has been 
evicted from cache. Thus after K references, there 
is enough reference information to cache RSi perma- 
nently, provided its reference rate is sufficiently high. 
To limit the total size of retained reference informa- 
tion, [OOW93] propose to cache the retained refer- 
ence information only for certain period after the last 
reference to the retrieved set. They suggest the Five 
Minute Rule [GP87] as a possible guideline for select- 
ing the length of the period. 

However, using a timeout based on the Five Minute 
Rule leads to two problems in our setting. First, the 
same period of time should not be used for retaining 
all reference information. The retained reference infor- 
mation associated with retrieved sets of large size that 
are cheap to materialize is of little value and should 
be dropped relatively soon, while the retained refer- 
ence information related to small retrieved sets that 
are expensive to materialize is valuable and should 
be kept for longer periods. Second, a timeout period 
based solely the Five Minute Rule does not take into 
account the available cache size: For example, when 
the cache is small, the retained reference information 
must be evicted even earlier than 5 minutes after the 
last reference. 

Both problems can be resolved by a relatively sim- 
ple policy: 

• Retained reference information related to re- 
trieved set RSi is evicted from cache whenever 
the profit associated with RSi is smaller than the 
least profit among all cached retrieved sets. 

To be able to calculate the profit, the retrieved set size 
Si and the cost of execution of Q, must be retained 
along with the reference information to RSi ■ When 
evaluating the profit of a retrieved set which has less 
than K reference times, we use the maximal available 
number of reference times as in Section 2.2. 

Clearly, retaining reference information related to 
retrieved sets with profits smaller than the least profit 
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among all cached retrieved sets does lead not to per- 
formance improvement because such retrieved sets 
would immediately become candidates for replace- 
ment, should they be cached. 

The policy also addresses the two aforementioned 
problems: first, retained reference information related 
to large retrieved sets which are cheap to materialize 
is kept for only a short period of time as their prof- 
its are small. At the same time, retained reference 
information related to small retrieved sets that are ex- 
pensive to materialize is retained longer as their profits 
are large. Second, the cache space occupied by the re- 
tained reference information is scaled with the total 
cache size. Should the size of the retained reference 
information become too large compared with the total 
cache size, the cache size left for storage of retrieved 
sets shrinks and therefore the least profit of a cached 
retrieved set increases, which in turn leads to evic- 
tion of more retained reference information. Similarly, 
should the size of the retained reference information 
become too small, the least profit of a cached retrieved 
set decreases and the policy for caching retained ref- 
erence information becomes more liberal. 

A similar starvation problem may also arise when 
the admission algorithm determines not to cache a re- 
trieved set. In this case, the reference information re- 
lated to the set is retained and its residence in the 
cache is guided by the above described policy. Conse- 
quently, a retrieved set that is initially rejected from 
cache may be admitted after a sufficient reference in- 
formation is collected. 


3 WATCHMAN Implementation 

WATCHMAN is implemented as a library of routines 
that may be linked with an application (e.g. a data 
warehouse manager). Consequently, it is relatively 
simple to add the WATCHMAN functionality oil top 
of an existing DBMS. Each cache entry consists of 
query ID, array of K timestamps, retrieved set size, 
cost of execution of the query, and a pointer to the 
retrieved set. A query ID consists of the query string 
(compressed by substituting all delimiters with a sin- 
gle special character). In general, retrieved sets may 
be stored either in main memory or on secondary stor- 
age. The current version of WATCHMAN stores all 
retrieved sets in main memory primarily to simplify 
storage management. 

In order to test whether a retrieved set of a given 
query is cached, WATCHMAN employs an exact query 
ID match. The cache hit ratio (and thus also the save- 
up of query execution costs) can be improved by test- 
ing for query equivalence. However, the query equiv- 
alence problem was shown to be NP-hard [SKN89). 
Several algorithms for testing special cases of query 


equivalence were developed [CR.94, GHQ95]. Any ^ 
these algorithms could be adopted in WATCHMA.' 
However, even the exact syntactic match might be pr 
hibitively expensive if calculated for all retrieved set 
To speed up cache lookup, we add to each cache e 
try a signature, which is computed as a hash functn 
over the query ID. Consequently, only the cache e 
tries having a signature identical with the looked t 
query need to be tested. 

Although updates in data warehouses are not as fr 
quent as in OLTP databases [IK93, Fre95], they st 
affect cache coherence. The current design of WATC1 
MAN assumes that the warehouse manager detec 
whether the update is relevant to the cache conte 
and modifies the retrieved sets that are affected 1 
the update. The retrieved set modification can be d 
termined either by executing the corresponding que 
from scratch or by detecting only incremental modi 
cations (see [GM95] for a review of such techniques 

It is possible that some of the pages buffered di: 
ing exe-uHon of query Q, are redundant because t 
retrieved set RS, is cached by WATCHMAN. If su 
a page is not used by any other query, then its reft 
ence rate decreases and thus it should be eventual 
dropped from the buffer pool. However, it is concei 
able that WATCHMAN provides hints to the buff 
manager by instructing it to evict those pages whn 
are used mostly by queries whose retrieved sets a 
cached. Such hints, if correct, may free the buff 
space faster and thus improve the buffer manage: 
performance. 

We designed a simulation testbed to study the int< 
action between WATCHMAN and the buffer manag. 
The buffer manager implements the LRU page replat 
ment algorithm. In addition, the buffer manager tak 
advantage of the hints sent from WATCHMAN ai 
moves selected pages to the end of the LRU chai 
For the purpose of simulation, WATCHMAN mai 
tains with every buffered page its query reference s 
which consists of ID’s of all queries that referenced t 
page. We say that a page is p-redundant if at least ; 
of its query relevant set is cached by WATCHMA 
After caching a retrieved set, WATCHMAN sends 
hint to the buffer manager to move all po -redundc 
pages, for a fixed threshold po, to the end of its LI 
chain. We currently investigate various compressi 
and sampling techniques to minimize the amount 
information necessary to compute the query referei 
set of each buffered page. Our preliminary experinu 
tal results in Section 4 show that such a cooper at i 
between WATCHMAN and the buffer manager ind. 
improves performance of the latter. 
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4 Performance Experiments 

4.1 Experimental Setup 

We tested the performance of WATCHMAN on traces 
based on TPC-D [Tra95] and Set Query [0’N93j 
benchmark workloads. The traces were gathered using 
Oracle 7 DBMS running on a HP 9000/700 worksta- 
tion. 

Databases 

We used databases of total size 30 Mbytes for 
TPC-D benchmark and 100 Mbytes for Set Query 
benchmark 3 . The relations were populated with syn- 
thetic data according to the benchmark specifications 
[TYa95, 0’N93]. We had to scale down sizes of both 
databases from their suggested 1 Gbyte (TPC-D) and 
200 Mbytes (Set Query) sizes because of the excessive 
time it took to collect traces of sufficient length. 

Workload Traces 

Each trace consists of a total of 17000 queries. With 
each query we recorded in the trace a timestamp of the 
retrieval time, query ID (see Section 3 for details), size 
of the retrieved set and execution cost of the query. We 
assumed that the query execution costs are dominated 
by disk I/O. Therefore, we set the query execution 
cost to the number of buffer block reads performed 
during execution of the query. By considering block 
reads from the buffer manager rather than physical 
disk block reads, we made the cost estimate indepen- 
dent of the current state of buffer manager. Conse- 
quently, the execution cost of each query is given by 
the number of disk block reads which would be done 
if no buffers were available. 

The TPC-D queries are in fact query templates 
which are instantiated with parameters generated ran- 
domly from pre-defined intervals. Therefore, the trace 
is obtained by running 17000 instances of the query 
templates with random parameters generated accord- 
ing to the benchmark specification rules. Because 
the parameter intervals are of different sizes, the to- 
tal number of instances for each query template varies 
substantially from an order of 10 to an order of 10 
Consequently, the trace captures the “drill-down anal- 
ysis” query distribution [IK93]: queries at high sum- 
marization levels repeat frequently within each trace, 
while queries at low summarization levels do not re- 
peat at all. Because we view the problems of cache 
coherence as independent of the problems of cache 
replacement and admission studied in this paper, we 
excluded the two update templates from TPC-D and 
used only the remaining 17 query templates. 

3 The reported sizes do not include indices. 


Although the Set Query benchmark also consists of 
several query templates, the total number of all in- 
stances does not exceed 100. Consequently, we modi- 
fied the parameterization of the Set Query benchmark 
to obtain a larger instance space. Similarly to TPC-D, 
we modeled the “drill-down analysis” query distribu- 
tion. 

Performance Metrics 

The cost savings ratio (CSR) defined in Section 2.1 is 
the primary performance metric in all reported results. 

As a secondary metric we use the cache hit ratio 
(HR) defined as 

HR = (I 7 ) 

E, r * 

where h, is the number of times that references to 
query Q, were satisfied from cache, and r, is the total 
number of references to query Q t . 

As a tertiary metric we consider the average exter- 
nal fragmentation of a cache which is defined as the 
average fraction of unused cache space. 

4.2 Experimental Results 
Infinite Cache 

We ran experiments with an unlimited cache size in or- 
der to study the potential of caching in our traces. The 
results in Figure 2 show that cost savings and hit ratios 
are relatively high on both traces indicating that both 
traces have a high reference locality. The Set Query 
benchmark trace yields a smaller hit ratio than TPC- 
D, but a higher cost savings ratio. We believe that 
this is due to the fact that all TPC-D queries perform 
costly joins, while many Set Query queries are inex- 
pensive projections. Consequently, the distribution of 
query execution costs is more skewed in the Set Query 
benchmark. 


CSR 

HR 

cache size 

db size 

TPC-D 

0.73 

0.72 

17.7 MB 

30 MB 

SQ 

0.92 

0.65 

16.1 MbI 

100 MB 


Figure 2: Performance with infinite cache. 


Impact of selection of K 

Selection of a larger K improves the estimates of re- 
trieved set reference rates. Consequently, it leads to 
an improvement of both cost savings and hit ratios. 
In Figure 3 we illustrate a typical behavior on exper- 
iments with a cache size set to 1% of database size. 
The improvement is quite strong in the case of LRU-K 
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, 48 1% on TPC-D and 29.2% on Set Query). Some- 
lhat surprisingly, the improvement of LNC-RA is not 
as strong (9.2% on TPC-D and 3.1% on Set Query). 
We conjecture that this is due to the relative simplicity 
of our workloads. The choice of K could play a more 
significant role under multi-class workloads in which 
each class has different reference characteristics. 



Set Query: cache size » 1% database size 


LNC-RA 
• LRU 


—A— LNC-RA 
♦ ■ LRU 


Figure 3: Impact of K on performance. 



TPC-D: cost savings ratio 



Set Query: cost savings ratio 
Figure 4: Cost Savings Ratios. 
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Algorithm Performance Comparison 

We studied the performance improvement of LNC- 
RA when compared with the vanilla LRU. A compar- 
ison of the performance of LNC-RA with LNC-R is 
also of interest, since LNC-RA makes heuristic de- 
cisions and thus it is not a priori clear whether its 
performance is always better than the performance of 
LNC-R- The cost savings and hit ratios of LNC-RA, 
LNC-R (with K set to 4) and vanilla LRU {K = 1) for 
various cache sizes can be found in Figures 4 and 5, 
respectively. We considered cache sizes ranging from 
0.1% to 5% of database size. This is a realistic assump- 
tion for data warehouses with sizes on the order of 1 - 
10 Gbytes. For comparison, we include in each grap 
also the maximal cost savings and hit ratios that can 
be achieved with an infinite cache (inf). 

The LNC-RA algorithm provides consistently bet- 
ter performance than LRU. LNC-RA achieves cost sav- 
ings ratios that are, on average, 4 times better on the 


TPC-D trace and 2.3 times better on the Set Que 
trace when compared with the vanilla LRU! The n 
provement obviously diminishes with the cache size: 
is maximal for the smallest cache size, when LNC-F 
improves LRU cost hit ratio by factor of 4.7 on t 
TPC-D trace and 7 on the Set Query trace. LNC-F 
also exhibits similar performance improvement for 1 
ratios as shown in Figure 5. However, cost savings i 
tios converge much faster to the maximal achieval 
level when compared with the hit ratios. 

Although the cache admission policy makes heur 
tic decisions in absence of reference information abc 
newly retrieved sets, it always improves the overall p 
formance, as Figures 4 and 5 show. LNC-RA achie- 
cost hit ratios that are, on average, a 32% improvers 
over LNC-R on TPC-D trace and a 6% improvem. 
on Set Query trace. Again, the improvement dim 
ishes with the cache size: The maximal improvem. 
is 88% on TPC-D trace and 30% on Set Query tra 
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— * — LNC-RA 
— m — LNC-R 
♦ - LRU 


0.8 



cadkc fisc (% of database size) 


Set Query: hit ratio 
Figure 5: Hit Ratios. 


External Cache Fragmentation 


The optimality results from Section 2.3 rely on the 
fact that the total unused cache space due to exter- 
nal fragmentation is negligible. We therefore studied 
experimentally the degree of external fragmentation. 

As Figure 6 shows, the external fragmentation of 
LNC-RA is indeed negligible: the fraction of used 

space does not drop below 96% and typically remains 
at 98.5 %. LNC-R and LRU cannot prevent caching 
of large retrieved sets because they do not employ any 
cache admission algorithm. Consequently, they ut lze 
storage space less efficiently than LNC-RA, but t eir 
external fragmentation is still relatively insign cant, 
the fraction of used space never drops below 88% and 
on average stays at 94.8%. 




Size <% of datobas* six*) 
TPC-D 



cache size (% of database size) 


Set Query 

Figure 6: External Fragmentation. 


Interaction with the Buffer Manager 

We studied the impact of using the hints sent froin 
WATCHMAN to the buffer manager on performan 
of the latter, namely its hit ratio. Recall tha e 
hint consists of ID’s of all pages that are po redundant 
for a fixed level of Po- Upon receipt of such a hint, the 
buffer manager moves all the qualifying pages o 
end of its LRU chain. We report here our preliminary 
results. We simulated an environment with 15 Mbyte 
page buffer pool, 15 Mbyte WATCHMAN cache and 
14 relations of total size 100 Mbytes. The workload 
consisted of 17000 queries run against the database 
resulting in more than 26 million page references Due 
to space limitation, we refer to [SSV96] for additional 
details on the experimental setup. 

In our experiments, we observed the buffer man- 
ager hit ratios as we decreased the threshold po from 
100% to 0%. The experimental results can be > oun 
in Figure 7. We found that by using the hints it 
possible to improve the buffer manager hltratl °J r °^ 
0 71 up to 0.80 when po = 60%. However, further de- 
crease of po leads to eviction of pages that are used by 
many other queries. Consequently the bu ffj manager 
hit ratio drops down to 0.40 when the modified LRU 
degenerates to MRU (po = 0%)- Therefore, WATCH- 
MAN’S hints indeed have a potential to improve the 
performance of buffer manager 
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Figure 7: Effects of Hints on Buffer Perfor- 

mance. 

5 Related Work 

Sellis studied cache replacement algorithms in the con- 
text of caching retrieved sets of queries with proce- 
dures [Scl88]. He suggested that the algorithms should 
also consider retrieved set size and cost of query execu- 
tion in addition to the reference rate. Several cache re- 
placement algorithms were proposed which either rank 
the retrieved sets using only one of the parameters or 
a weighted sum of all of them. However, no guideline 
for setting the weights is provided. Unlike LNC-R, the 
proposed algorithms do not maximize query execution 
cost savings. The performance of the algorithms is not 
studied either analytically or experimentally. Caching 
of retrieved sets of queries containing either proce- 
dures or method invocations was subsequently studied 
in [Jhi88, Hel94, KKM94]. However, the work con- 
centrates primarily on cache organization, integration 
with query optimization, and update handling rather 
than on the design of cache replacement and admission 
algorithms. 

Keller and Basu propose a cache replacement algo- 
rithm for materialized predicates which is similar to 
LNC-R [KB96]. Unlike LNC-R, however, it considers 
only the last reference to each predicate. The perfor- 
mance of the algorithm is not studied either analyti- 
cally or experimentally. No explicit cache admission 
algorithm is considered. 

The ADMS database system benefits from caching 
at multiple levels [CR94, RCK+95]. Both retrieved 
sets and pointers to their tuples may be cached. Effi- 
cient algorithms for both cache updating and testing of 
a limited form of query equivalence are designed. LRU, 
LFU and Largest Space Required (LCS) replacement 
algorithms are adopted and their performance is ex- 
perimentally studied [CR94]. The experimental results 
indicate that LRU consistently provides the worst per- 
formance, while LCS the best. This is in accord with 
our experimental findings which show that the more 


between WATCHMAN and buffer manager. 


information a cache replacement algorithm uses, the 
better the performance it achieves. However, unlike 
LNC-RA, none of these algorithms aims at maximiz- 
ing the query execution cost savings. 

Harinarayan et. al. design an algorithm for se- 
lective pre-computation of decision support queries 
[HRU96]. Their algorithm minimizes the storage re- 
quirements. However, it does not take into account 
workload characteristics. We view this work as com- 
plementary to ours. Certainly, it is beneficial to bring 
some retrieved sets to cache before they are referenced. 
However, on demand caching is also important due to 
its ability to dynamically adapt to the workload char- 
acteristics. 

Design of efficient buffer replacement algorithms 
has gained lots of attention [LWF77, EH84, Sto84, 
CD85, OOW93, FNS95]. In particular, the LRU-K 
cache replacement algorithm [OOW93] is closely re- 
lated to LNC-RA in its use of last K reference times 
to every cached object. However, unlike LNC-RA the 
buffer replacement algorithms rely on an uniform size 
of all pages and an uniform cost of fetching each page 
into the cache. The sliding window estimate of re- 
quest arrival rates similar in Section 2.1 is similar to 
the notion of “heat” used in several distributed DBMS 
projects [CABK88, SWZ94, VBW95]. 

To the best of our knowledge, none of the previous 
works formulated an integrated cache replacement and 
admission algorithm which consider the last K refer- 
ence times to each retrieved set, as well as, a profit 
metric incorporating our statistics. Furthermore, no 
previous works evaluated the performance benefits of 
using such algorithm on standard decision support 
benchmarks. 

6 Conclusions and Future Work 

We have presented the design of an intelligent data 
warehouse cache manager WATCHMAN. WATCH- 
MAN employs novel cache replacement and cache ad- 
mission algorithms. The algorithms explicitly consider 
retrieved set sizes and execution costs of the associ- 
ated queries in order to minimize the query response 
time. We have shown the optimality of the cache re- 
placement and admission algorithms within a simpli- 
fied model. We evaluated the performance of WATCH- 
MAN experimentally using the TCP-D and Set Query 
benchmarks. 

In summary, the experimental results show that 
the cache replacement algorithm used by WATCH- 
MAN, LNC-RA, improves the cost savings ratio, on 
average, by a factor of 3, when compared with the 
vanilla LRU. The cache admission algorithm LNC-A, 
although based on a heuristic, improves the cost sav- 
ings ratio by an average of 19%. Using more than 


60 


the last reference time to a retrieved set improves cost 
savings ratio of LNC-RA on average by 5%. Exter- 
nal cache fragmentation of LNC-RA is minimal (less 
than 4% of the cache size). Therefore, the assump- 
tions made in Section 2.3 are justified. We also show 
that the WATCHMAN’S hints can improve the perfor- 
mance of buffer manager. 

We are currently investigating the following topics: 

• Multiclass workloads. Our experiments show that 
the performance improvement by selecting K > 1 
is relatively insignificant. When generating the 
query stream, we attempted to maximally adhere 
to the benchmark specification rules. However, 
such a workload fails to model an environment 
with a query stream consisting of multiple classes 
of queries, each with a different reference char- 
acteristics. It has been argued in [OOW93] that 
this is the type of environment in which retaining 
more than the last reference is most beneficial. 
We intend to study such workloads. 

• Query equivalence testing. The cache hit ratio 
(and thus also the cost savings ratio) can be im- 
proved by testing for some special cases of query 
equivalence rather than looking only for an exact 
query match. An ideal test should cover a suf- 
ficiently wide range of equivalence cases, but at 
the same time incur only minimal overhead. To 
our best knowledge, only a single testing method 
has been developed for queries with aggregates 
[GHQ95]. However, this method, based on a set 
of rewrite rules, appears to be too expensive to 
be used in our setting. We therefore intend to 
pursue the development of a simpler method for 
WATCHMAN. 
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